On the role of duration prediction and symbolic representation for the evaluation of synthetic speech
Authors
Abstract
To determine priorities for improving timing in synthetic speech, this study examines the role of segmental duration prediction and of phonological symbolic representation in listeners' preferences. In perception experiments using German speech synthesis, two standard duration models (Klatt rules and CART) were tested. The input to these models consisted of symbolic strings derived either from a database or from a text-to-speech system. The results show that different duration models can only be distinguished when the symbolic string is appropriate. Given the relative importance of the symbolic representation, "post-lexical" segmental rules were then investigated, with the outcome that listeners differ in their preferences regarding the degree of segmental reduction. In conclusion, to improve timing in synthetic speech, it is important to calculate an appropriate phonological symbolic representation before fine-tuning the duration prediction.
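For readers unfamiliar with rule-based duration prediction, the following is a minimal sketch of the general form of a Klatt-style duration rule, in which each segment has an inherent and a minimum duration and context rules scale the stretchable part by a percentage. The `Segment` class, the function name, and all numeric values are hypothetical illustrations and are not taken from the study or from any German rule set.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    phone: str
    inherent_ms: float  # hypothetical inherent duration of the segment
    minimum_ms: float   # hypothetical incompressible minimum duration


def klatt_duration(seg: Segment, rule_percents: list[float]) -> float:
    """Klatt-style rule: DUR = MINDUR + (INHDUR - MINDUR) * PRCNT / 100.

    Each applicable context rule contributes a percentage; the percentages
    are multiplied together to give the overall PRCNT.
    """
    prcnt = 100.0
    for p in rule_percents:
        prcnt *= p / 100.0
    return seg.minimum_ms + (seg.inherent_ms - seg.minimum_ms) * prcnt / 100.0


# Illustrative values only: a vowel with one shortening and one lengthening rule.
vowel = Segment(phone="a", inherent_ms=200.0, minimum_ms=100.0)
print(klatt_duration(vowel, []))           # no rules apply: inherent duration
print(klatt_duration(vowel, [50.0, 140.0]))
```

A CART model, by contrast, would learn the mapping from symbolic context to duration from annotated data. In both cases the prediction depends on the symbolic string given as input, which is the dependency the abstract highlights.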
Similar resources
The Role of Duration Models and Symbolic Representation for Timing in Synthetic Speech
In order to determine priorities for the improvement of timing in synthetic speech this study looks at the role of segmental duration prediction and the role of phonological symbolic representation in the perceptual quality of a text-to-speech system. In perception experiments using German speech synthesis, two standard duration models (Klatt rules and CART) were tested. The input to these mode...
Role of Types of Inner Speech in the Prediction of Symptoms of Anxiety, Depression, Somatization, and Distress in the Normal Population
Objective: It is extremely common for adults to use inner speech to regulate their behavior. Despite this, little is known about the underlying processes that may explain why people use inner speech differently. This study aimed to determine the relationship between different types of inner speech with symptoms of anxiety, depression, somatization, and distress in normal people. Methods: The r...
The “Kiel Corpus of Read Speech” as a Resource for Prosody Prediction in Speech Synthesis
The naturalness of synthetic speech depends strongly on the prediction of appropriate prosody. For the present study the original annotation of the German speech database “Kiel Corpus of Read Speech” was extended automatically with syntactic features, word frequency, and syllable boundaries. Several classification and regression trees for predicting symbolic prosody features, postlexical phonol...
Planelet Transform: A New Geometrical Wavelet for Compression of Kinect-like Depth Images
With the advent of cheap indoor RGB-D sensors, proper representation of piecewise planar depth images is crucial toward an effective compression method. Although there exist geometrical wavelets for optimal representation of piecewise constant and piecewise linear images (i.e. wedgelets and platelets), an adaptation to piecewise linear fractional functions which correspond to depth variation ov...
Iranian Women, Inside or Outside of the Stadium? An Anthropological Study on Female Representation of National Identity in Iran
A controversial and comprehensive debate that has resulted in numerous discursive clashes in Iran pertains to the presence of women at stadiums during male soccer matches. Different discourse systems have expressed their own contradictory and opposite stances in terms of whether Iranian women have the right to attend such events inside or outside the stadium, ranging from different notions of r...